Tutorial 1 (low-level variant): using a Quantum Device to extract machine-learning features
(download this tutorial here (external))
This notebook reproduces the first part of the QEK paper (external) using the library's low-level API.
By the end of this notebook, you will know how to:
- Import a molecular dataset (the library supports other types of graphs, of course).
- Compile a register and a pulse from each graph.
- Launch the execution of this compiled register/pulse on a quantum emulator or a physical QPU.
- Use the result to extract the relevant machine-learning features.
A companion notebook reproduces the machine-learning part of the QEK paper.
If you are not interested in quantum-level details, you may prefer the companion high-level notebook, which mirrors this one but uses a higher-level API that takes care of these details for you.
Dataset preparation
As in any machine learning task, we first need to load and prepare data. QEK can work with many types of graphs, including molecular graphs. For this tutorial, we will use the PTC-FM dataset, which contains such molecular graphs.
# Load the original PTC-FM dataset
import torch_geometric.datasets as pyg_dataset
from qek.shared.retrier import PygRetrier
# We use PygRetrier to retry the download if it fails.
og_ptcfm = PygRetrier().insist(pyg_dataset.TUDataset, root="dataset", name="PTC_FM")
display("Loaded %s samples" % (len(og_ptcfm), ))
'Loaded 349 samples'
This package lets researchers embed graphs on Analog Quantum Devices. To do this, we need to give these graphs a geometry (their positions in space) and to confirm that the geometry is compatible with a Quantum Device.
This package builds upon the Pulser framework (external). Our objective, in this notebook, is to compile graphs into the format understood by our Quantum Devices: a Pulser Register (the position of qubits) and Pulser Pulses (the laser impulses controlling the evolution of the analog device).
As the geometry depends on the Quantum Device, we need to specify a device to use. For the time being, we'll use Pulser's AnalogDevice, which is a reasonable default device. Further down, we'll show you how to use another device.
In this example, our graphs are representations of molecules. To simplify things, we'll use the dedicated class qek.data.graphs.PTCFMGraph, which uses bio-chemical tools to compute a reasonable geometry from molecular data, using the PTCFM conventions, for a specific Quantum Device. For other classes of graphs, you will need to decide how to compute the geometry and use qek.data.graphs.BaseGraph.
from tqdm import tqdm
import pulser as pl
import qek.data.graphs as qek_graphs
graphs_to_compile = []
for i, data in enumerate(tqdm(og_ptcfm)):
    graph = qek_graphs.PTCFMGraph(data=data, device=pl.AnalogDevice, id=i)
    graphs_to_compile.append(graph)
Compile a Register and a Pulse
Once the embedding is found, we compile a Register (the position of atoms on the Quantum Device) and a Pulse (the lasers applied to these atoms).
Note that not all graphs can be embedded on a given device. In this notebook, for the sake of simplicity, we simply discard graphs that cannot be trivially embedded. Future versions of this library may succeed at embedding more graphs.
from qek.shared.error import CompilationError
compiled = []
for graph in tqdm(graphs_to_compile):
    try:
        register = graph.compile_register()
        pulse = graph.compile_pulse()
    except CompilationError:
        # Let's just skip graphs that cannot be compiled.
        print("Graph %s cannot be compiled for this device" % (graph.id, ))
        continue
    compiled.append((graph, register, pulse))
print("Compiled %s graphs into registers/pulses" % (len(compiled), ))
Graph 1 cannot be compiled for this device Graph 16 cannot be compiled for this device Graph 23 cannot be compiled for this device Graph 25 cannot be compiled for this device Graph 26 cannot be compiled for this device Graph 34 cannot be compiled for this device
Graph 40 cannot be compiled for this device Graph 43 cannot be compiled for this device Graph 53 cannot be compiled for this device Graph 58 cannot be compiled for this device Graph 60 cannot be compiled for this device Graph 61 cannot be compiled for this device Graph 62 cannot be compiled for this device Graph 65 cannot be compiled for this device Graph 68 cannot be compiled for this device Graph 78 cannot be compiled for this device
Graph 86 cannot be compiled for this device Graph 97 cannot be compiled for this device Graph 101 cannot be compiled for this device Graph 104 cannot be compiled for this device Graph 105 cannot be compiled for this device Graph 107 cannot be compiled for this device Graph 115 cannot be compiled for this device Graph 117 cannot be compiled for this device Graph 118 cannot be compiled for this device Graph 122 cannot be compiled for this device Graph 126 cannot be compiled for this device Graph 127 cannot be compiled for this device Graph 128 cannot be compiled for this device Graph 129 cannot be compiled for this device
Graph 132 cannot be compiled for this device Graph 135 cannot be compiled for this device Graph 144 cannot be compiled for this device Graph 155 cannot be compiled for this device Graph 157 cannot be compiled for this device Graph 165 cannot be compiled for this device Graph 166 cannot be compiled for this device
Graph 169 cannot be compiled for this device Graph 171 cannot be compiled for this device Graph 175 cannot be compiled for this device Graph 181 cannot be compiled for this device Graph 185 cannot be compiled for this device Graph 186 cannot be compiled for this device Graph 193 cannot be compiled for this device Graph 197 cannot be compiled for this device Graph 203 cannot be compiled for this device Graph 204 cannot be compiled for this device Graph 206 cannot be compiled for this device
Graph 208 cannot be compiled for this device Graph 214 cannot be compiled for this device Graph 215 cannot be compiled for this device Graph 220 cannot be compiled for this device Graph 224 cannot be compiled for this device Graph 238 cannot be compiled for this device Graph 239 cannot be compiled for this device
Graph 243 cannot be compiled for this device Graph 244 cannot be compiled for this device Graph 245 cannot be compiled for this device Graph 246 cannot be compiled for this device Graph 247 cannot be compiled for this device Graph 259 cannot be compiled for this device Graph 260 cannot be compiled for this device Graph 264 cannot be compiled for this device Graph 268 cannot be compiled for this device Graph 269 cannot be compiled for this device Graph 270 cannot be compiled for this device Graph 273 cannot be compiled for this device Graph 278 cannot be compiled for this device Graph 279 cannot be compiled for this device Graph 281 cannot be compiled for this device
Graph 284 cannot be compiled for this device Graph 313 cannot be compiled for this device
Graph 319 cannot be compiled for this device Graph 327 cannot be compiled for this device Graph 333 cannot be compiled for this device Graph 338 cannot be compiled for this device Graph 342 cannot be compiled for this device
Compiled 272 graphs into registers/pulses
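As noted above, not every graph can be embedded on a given device. To give an intuition for why, here is a minimal sketch of the kind of geometric check a device may impose on a layout. This is not the library's implementation: the function `fits_on_device` and the numeric limits are hypothetical, chosen only for illustration, and real devices may enforce additional or different constraints.

```python
import math
from itertools import combinations


def fits_on_device(positions, max_atoms, min_distance, max_radius):
    """Return True if a 2D layout respects three illustrative device limits."""
    # The device can only trap a limited number of atoms.
    if len(positions) > max_atoms:
        return False
    # Every atom must sit within max_radius of the center of the layout area.
    if any(math.hypot(x, y) > max_radius for x, y in positions):
        return False
    # Any two atoms must be at least min_distance apart.
    return all(
        math.dist(p, q) >= min_distance for p, q in combinations(positions, 2)
    )


# Hypothetical limits (in micrometers), not taken from any real device.
LIMITS = dict(max_atoms=25, min_distance=5.0, max_radius=35.0)

ok_layout = [(0.0, 0.0), (6.0, 0.0), (0.0, 6.0)]
too_close = [(0.0, 0.0), (2.0, 0.0)]

print(fits_on_device(ok_layout, **LIMITS))
print(fits_on_device(too_close, **LIMITS))
```

A graph whose computed geometry fails a check of this kind would be a natural candidate for raising a compilation error, as in the loop above.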
Let's take a look at some of these registers and pulses.
example_graph, example_register, example_pulse = compiled[64]
# The molecule, as laid out on the Quantum Device.
example_register.draw()
# The laser pulse used to control its state evolution.
example_pulse.draw()
Experimenting with registers and pulses
You may experiment with different pulses by passing arguments to compile_pulse.
example_pulse = graphs_to_compile[0].compile_pulse(normalized_amplitude=0.1, normalized_duration=0.1) # arbitrary values
example_pulse.draw()
You can experiment further, using arbitrary pulses and registers, but for this, you'll have to use the low-level Pulser framework, which goes beyond the scope of this tutorial. You may find further details on pulses and registers in the documentation of Pulser (external).
Executing the compiled graphs on an emulator
While our objective is to run the compiled graphs on a physical QPU, it is generally a good idea to test out some of these compiled graphs on an emulator first. For this example, we'll use the QutipEmulator, the simplest emulator provided with Pulser.
from qek.data.processed_data import ProcessedData
from qek.target.backends import QutipBackend
# In this tutorial, to make things faster, we'll only run the graphs that require 5 qubits or fewer.
# If you wish to run more entries, feel free to increase this value.
#
# # Warning
#
# Emulating a Quantum Device takes an exponential amount of resources and time! If you set MAX_QUBITS too
# high, you can bring your computer to its knees and/or crash this notebook.
MAX_QUBITS = 5
processed_dataset = []
backend = QutipBackend(device=pl.AnalogDevice)
for graph, register, pulse in tqdm(compiled):
    if len(register) > MAX_QUBITS:
        continue
    states = await backend.run(register=register, pulse=pulse)
    processed_dataset.append(ProcessedData.custom(register=register, pulse=pulse, device=pl.AnalogDevice, state_dict=states, target=graph.target))
As mentioned, there are limits to what an emulator can do.
Pasqal has also developed an emulator called emu-mps, which generally provides much better performance and resource usage, so if you hit resource limits, don't hesitate to check it out (external)!
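To make the exponential cost of emulation concrete, here is a rough back-of-the-envelope sketch. It assumes a full state-vector simulation storing one double-precision complex amplitude (16 bytes) per basis state; this is an assumption for illustration, and actual emulators may use different representations and need additional working memory.

```python
def statevector_bytes(n_qubits, bytes_per_amplitude=16):
    """Memory needed to store 2**n_qubits complex amplitudes."""
    return (2 ** n_qubits) * bytes_per_amplitude


for n in (5, 10, 20, 30):
    mib = statevector_bytes(n) / 2**20
    print(f"{n:>2} qubits: {mib:,.3f} MiB")
```

Under these assumptions, each additional qubit doubles the memory needed, which is why MAX_QUBITS is kept small in this tutorial.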
Executing compiled graphs on a QPU
Once you have checked that the compiled graphs work on an emulator, you will probably want to move to a QPU. Execution on a QPU takes resources polynomial in the number of qubits, which hopefully means an almost exponential speedup for large numbers of qubits.
To experiment with a QPU, you will need either physical access to a QPU, or an account with PASQAL Cloud (external), which provides you remote access to QPUs built and hosted by Pasqal. In this section, we'll see how to use the latter.
If you don't have an account, just skip to the next section!
HAVE_PASQAL_ACCOUNT = False # If you have a PASQAL Cloud account, fill in the details and set this to `True`.
if HAVE_PASQAL_ACCOUNT:
    from qek.target.backends import RemoteQPUBackend
    processed_dataset = []
    # Initialize connection
    my_project_id = "your_project_id"  # Replace this value with your project_id on the PASQAL platform.
    my_username = "your_username"  # Replace this value with your username or email on the PASQAL platform.
    my_password = "your_password"  # Replace this value with your password on the PASQAL platform.
    # Security note: In real life, you probably don't want to write your password in the code.
    # See the documentation of PASQAL Cloud for other ways to provide your password.
    # Initialize the cloud client
    backend = RemoteQPUBackend(username=my_username, project_id=my_project_id, password=my_password)
    # Fetch the specification of our QPU
    device = await backend.device()
    # As previously, create the list of graphs and embed them.
    graphs_to_compile = []
    for i, data in enumerate(tqdm(og_ptcfm)):
        graph = qek_graphs.PTCFMGraph(data=data, device=device, id=i)
        graphs_to_compile.append(graph)
    compiled = []
    for graph in tqdm(graphs_to_compile):
        try:
            register = graph.compile_register()
            pulse = graph.compile_pulse()
        except CompilationError:
            # Let's just skip graphs that cannot be compiled.
            print("Graph %s cannot be compiled for this device" % (graph.id, ))
            continue
        compiled.append((graph, register, pulse))
    # Now that the connection is initialized, we just have to send the work
    # to the QPU and wait for the results.
    for graph, register, pulse in tqdm(compiled):
        # Send the work to the QPU and await the result
        states = await backend.run(register=register, pulse=pulse)
        processed_dataset.append(ProcessedData.custom(register=register, pulse=pulse, device=device, state_dict=states, target=graph.target))
There are other ways to use the SDK. For instance, you can enqueue a job and check later whether it has completed. Also, to work around the long waiting lines, Pasqal provides high-performance distributed and hardware-accelerated emulators, which you can access through the SDK.
For more details, take a look at the documentation of the SDK (external).
...or using the provided dataset
For this notebook, instead of spending hours running the simulator on your computer, we're going to cheat and load the results, which are conveniently stored in ptcfm_processed_dataset.json.
import qek.data.processed_data as qek_dataset
processed_dataset = qek_dataset.load_dataset(file_path="ptcfm_processed_dataset.json")
print(f"Size of the quantum compatible dataset = {len(processed_dataset)}")
Size of the quantum compatible dataset = 279
A look at the results
Let's take a look at one of our samples:
# The geometry we compiled from this graph for execution on the Quantum Device.
dataset_example: ProcessedData = processed_dataset[64]
dataset_example.draw_register()
# The laser pulses we used to drive the execution on the Quantum Device.
dataset_example.draw_pulse()
The results of executing the embedding on the Quantum Device are in the field state_dict:
display(dataset_example.state_dict)
print(f"Total number of samples: {sum(dataset_example.state_dict.values())}")
{'00100000100': 15, '00100010010': 13, '10100100001': 7, '10000100000': 2, '10000000010': 29, '10000001010': 43, '01000000000': 20, '10000000000': 33, '10100001011': 3, '00001010001': 2, '01000001010': 9, '01000000100': 7, '00110000000': 6, '00100101010': 2, '10000000001': 13, '10010101100': 3, '01000010001': 8, '00000000000': 11, '00100000010': 21, '00100001100': 24, '01001010010': 2, '10000001001': 13, '00110001010': 15, '00101000010': 3, '00100010001': 4, '00110010010': 9, '10001001000': 4, '00100100010': 3, '00100000001': 6, '01000010010': 17, '10100001000': 8, '10110000100': 2, '10000010000': 11, '00010000000': 3, '00101001000': 2, '00100000000': 40, '00110000010': 11, '00100100011': 5, '10010000000': 17, '00100001010': 38, '10000001000': 16, '10001010011': 1, '10001010010': 5, '10000001100': 16, '10110000000': 7, '10010010010': 6, '00100001001': 19, '10010000010': 7, '00101001011': 3, '00101000100': 1, '10101001001': 5, '10100100000': 2, '10000010010': 20, '10000101011': 3, '00010101100': 5, '00000101000': 2, '00000010000': 5, '00101010000': 6, '00001001001': 1, '01001010000': 4, '00110001000': 5, '01001000011': 2, '00100100001': 2, '01000101010': 3, '10110100000': 3, '10100100010': 1, '01000100100': 4, '01100000010': 1, '10110001100': 2, '10010000100': 2, '00100001000': 18, '10110100100': 2, '00000001010': 10, '00101010010': 1, '10000000100': 15, '00000001100': 3, '10110001000': 5, '10100000011': 2, '10110010000': 5, '00000000100': 3, '10000100011': 3, '00010101000': 3, '01001001011': 1, '00101001001': 1, '10000010001': 2, '10000101001': 1, '00100010000': 7, '10100001010': 8, '10100100011': 3, '10100010010': 3, '01000001100': 9, '00110001100': 2, '10100010000': 2, '10000000011': 5, '10101000001': 3, '10100000000': 11, '00000100000': 6, '10110100010': 4, '00000101010': 2, '10101010011': 6, '00100000011': 4, '10010001010': 14, '00101000011': 2, '10010001000': 2, '00000001000': 7, '01001000000': 1, '00001010000': 2, '00110101010': 1, '01001001001': 1, 
'00100101001': 2, '10100101001': 2, '00010001000': 2, '10100001100': 5, '10000100100': 2, '10101000000': 2, '00110000100': 2, '10010010000': 2, '01000010000': 9, '10101001010': 1, '01000100000': 2, '00101010001': 1, '00001000001': 3, '01001001010': 2, '00000100011': 2, '00100001101': 1, '10000101010': 4, '10000001011': 3, '00100100000': 2, '00000001001': 1, '00100110001': 1, '00110010000': 2, '10010001100': 1, '00010100100': 3, '00000100001': 2, '01000001001': 5, '01000001000': 8, '10100000010': 2, '00000001011': 1, '10001000010': 1, '10110101000': 1, '00110100010': 5, '01000000010': 11, '00000000010': 3, '10010101000': 1, '00010100010': 2, '10100010001': 1, '10001000000': 3, '01000100010': 2, '00000010010': 2, '10101001011': 1, '10001010000': 2, '00001001000': 1, '00110100000': 4, '10010100000': 1, '00101001010': 2, '01000101000': 1, '00101010011': 1, '10100001101': 1, '01001010011': 2, '01001010001': 1, '01001101010': 1, '10100000100': 2, '01001000001': 3, '10101010001': 4, '10110011100': 1, '01000000001': 1, '00110000011': 1, '00100101011': 3, '00000100010': 1, '10000101000': 1, '00001010011': 1, '10001001010': 4, '00001001010': 1, '10010100010': 3, '01000101011': 1, '00001000011': 1, '10001001011': 2, '00101000001': 1, '10001000011': 1, '01000010011': 1, '10001001001': 1, '01001001000': 1, '10110000010': 1, '00100010011': 1, '00101001110': 1, '10001000001': 2, '00000000011': 1, '10000101100': 1, '00101000000': 1, '00100001011': 1, '10100000001': 1, '00100101000': 1, '00010001100': 1, '00010010000': 1, '01000000011': 2, '10000100010': 1, '01000010110': 1}
Total number of samples: 1000
This dictionary represents an approximation of the quantum state of the device for this graph after completion of the algorithm.
- each of the keys represents one possible state for the register (which represents the graph), with each qubit (which represents a single node) being in state 0 or 1;
- the corresponding value is the number of samples observed with this specific state of the register.
In this example, for instance, we can see that the state observed most frequently is 10000001010, with 43/1000 samples.
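As a small illustration of how to read such a dictionary, the sketch below finds the most frequent state and converts counts into estimated probabilities. The dictionary `counts` is made up for this example; it only mimics the format of state_dict (bitstring keys, integer sample counts).

```python
# A small, made-up counts dictionary in the same format as state_dict.
counts = {"100": 5, "010": 3, "110": 2}

# Total number of samples.
total = sum(counts.values())

# The state observed most often, and how many times it was observed.
most_frequent, hits = max(counts.items(), key=lambda kv: kv[1])

# Estimated probability of each observed state.
probabilities = {state: n / total for state, n in counts.items()}

print(f"Most frequent state: {most_frequent} ({hits}/{total} samples)")
print(probabilities)
```

The same three lines of logic can be applied to dataset_example.state_dict directly.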
Note: Since Quantum Devices are inherently non-deterministic, you will probably obtain different samples if you run this on a Quantum Device instead of loading the dataset.
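The effect of this sampling noise can be mimicked classically. The sketch below draws a fixed number of samples from a made-up probability distribution using two different random seeds; it is only an analogy for repeated runs on a device, and the helper `sample_counts` is hypothetical, not part of the library.

```python
import random
from collections import Counter


def sample_counts(probabilities, shots, seed):
    """Draw `shots` samples from a distribution and count each outcome."""
    rng = random.Random(seed)
    states = list(probabilities)
    weights = [probabilities[s] for s in states]
    return Counter(rng.choices(states, weights=weights, k=shots))


# Made-up distribution over 2-qubit states, for illustration only.
probabilities = {"00": 0.5, "01": 0.3, "10": 0.2}

run_a = sample_counts(probabilities, shots=1000, seed=1)
run_b = sample_counts(probabilities, shots=1000, seed=2)
print(run_a)
print(run_b)
```

Both runs contain 1000 samples drawn from the same distribution, yet the individual counts will generally differ from one run to the next.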
Machine-learning features
From the state dictionary, we derive the distribution of excitation, which serves as our machine-learning feature. We'll use this in the next notebook to define our kernel.
dataset_example.draw_excitation()
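For intuition, here is a minimal sketch of how such a distribution could be computed from a counts dictionary. It assumes that "distribution of excitation" means the estimated probability of observing exactly k qubits in state 1, for each k; the function `excitation_distribution` is hypothetical, and the library may compute or normalize this feature differently.

```python
from collections import Counter


def excitation_distribution(state_dict, n_qubits):
    """Estimate P(k qubits in state 1) for k = 0..n_qubits from sample counts."""
    total = sum(state_dict.values())
    histogram = Counter()
    for bitstring, count in state_dict.items():
        # The number of excitations is the number of '1' characters.
        histogram[bitstring.count("1")] += count
    return [histogram[k] / total for k in range(n_qubits + 1)]


# Made-up counts for a 3-qubit register, for illustration only.
counts = {"000": 2, "100": 5, "010": 1, "110": 2}
dist = excitation_distribution(counts, n_qubits=3)
print(dist)  # [0.2, 0.6, 0.2, 0.0]
```

The result is a fixed-length vector for a given register size, which is convenient when comparing graphs.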
What now?
What we have seen so far covers the use of a Quantum Device to extract machine-learning features.
For the next step, we'll see how to use these features for machine learning.